Overview

Dataset statistics

Number of variables: 21
Number of observations: 34857
Missing cells: 100975
Missing cells (%): 13.8%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 21.8 MiB
Average record size in memory: 657.0 B
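
The derived figures in this block can be cross-checked with a little arithmetic. The sketch below is not part of the generated report; it only recomputes the percentages and the total size from the counts listed above, and the variable names are illustrative.

```python
# Cross-check of the derived overview figures, using only numbers from the report.
n_variables = 21
n_observations = 34857
missing_cells = 100975
duplicate_rows = 1
avg_record_bytes = 657.0  # "Average record size in memory"

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_mib = avg_record_bytes * n_observations / 2**20  # bytes -> MiB

print(f"Total cells:        {total_cells}")
print(f"Missing cells (%):  {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.4f}%")
print(f"Total size:         {total_mib:.1f} MiB")
```

The missing-cell share is taken over all cells (variables times observations), not over rows, which is why it is lower than the per-variable missing rates of the sparsest columns.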

Variable types

NUM: 13
CAT: 8

Reproduction

Analysis started: 2020-04-19 08:04:32.387150
Analysis finished: 2020-04-19 08:06:32.246303
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 1 (< 0.1%) duplicate rows Duplicates
Suburb has a high cardinality: 351 distinct values High cardinality
Address has a high cardinality: 34009 distinct values High cardinality
SellerG has a high cardinality: 388 distinct values High cardinality
Date has a high cardinality: 78 distinct values High cardinality
Bedroom2 is highly correlated with Rooms High Correlation
Rooms is highly correlated with Bedroom2 High Correlation
Price has 7610 (21.8%) missing values Missing
Bedroom2 has 8217 (23.6%) missing values Missing
Bathroom has 8226 (23.6%) missing values Missing
Car has 8728 (25.0%) missing values Missing
Landsize has 11810 (33.9%) missing values Missing
BuildingArea has 21115 (60.6%) missing values Missing
YearBuilt has 19306 (55.4%) missing values Missing
Lattitude has 7976 (22.9%) missing values Missing
Longtitude has 7976 (22.9%) missing values Missing
Landsize is highly skewed (γ1 = 96.02231136) Skewed
BuildingArea is highly skewed (γ1 = 99.13257937) Skewed
Car has 1631 (4.7%) zeros Zeros
Landsize has 2437 (7.0%) zeros Zeros
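
The missing-value warnings can be reconciled with the overall "Missing cells" figure. The following sketch is not produced by the report; it copies the per-variable missing counts (including the small counts listed in the per-variable sections that did not trigger a warning) and checks that they add up. The dictionary and variable names are illustrative.

```python
# Missing-value counts per variable, copied from the report.
n_observations = 34857
missing = {
    "Price": 7610, "Bedroom2": 8217, "Bathroom": 8226, "Car": 8728,
    "Landsize": 11810, "BuildingArea": 21115, "YearBuilt": 19306,
    "Lattitude": 7976, "Longtitude": 7976,
    # Small counts from the per-variable sections (no warning raised).
    "Distance": 1, "Postcode": 1, "CouncilArea": 3, "Regionname": 3,
    "Propertycount": 3,
}

# Share of missing observations per variable, rounded as in the report.
shares = {name: round(100 * count / n_observations, 1)
          for name, count in missing.items()}
total_missing = sum(missing.values())

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<14} {missing[name]:>6}  {share:>5.1f}%")
print("Total missing cells:", total_missing)
```

Lattitude and Longtitude share the same missing count (7976), which is consistent with the two coordinates being absent together.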

Variables

Suburb
Categorical

HIGH CARDINALITY
Distinct count: 351
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: Reservoir (844), Bentleigh East (583), Richmond (552), Glen Iris (491), Preston (485); Other values (346): 31902

Value | Count | Frequency (%)
Reservoir | 844 | 2.4%
Bentleigh East | 583 | 1.7%
Richmond | 552 | 1.6%
Glen Iris | 491 | 1.4%
Preston | 485 | 1.4%
Kew | 467 | 1.3%
Brighton | 456 | 1.3%
Brunswick | 444 | 1.3%
South Yarra | 435 | 1.2%
Hawthorn | 428 | 1.2%
Other values (341) | 29672 | 85.1%

Length

Max length: 18
Mean length: 9.819175488
Min length: 3

Value | Count | Frequency (%)
Lowercase_Letter | 25 | 51.0%
Uppercase_Letter | 23 | 46.9%
Space_Separator | 1 | 2.0%

Value | Count | Frequency (%)
Latin | 48 | 98.0%
Common | 1 | 2.0%

Value | Count | Frequency (%)
ASCII | 49 | 100.0%
 

Address
Categorical

HIGH CARDINALITY
UNIFORM
Distinct count: 34009
Unique (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: 5 Charles St (6), 25 William St (4), 57 Bay Rd (3), 3 Charles St (3), 33 McCracken St (3); Other values (34004): 34838

Value | Count | Frequency (%)
5 Charles St | 6 | < 0.1%
25 William St | 4 | < 0.1%
57 Bay Rd | 3 | < 0.1%
3 Charles St | 3 | < 0.1%
33 McCracken St | 3 | < 0.1%
16 Clyde St | 3 | < 0.1%
9 Margaret St | 3 | < 0.1%
38 Lily St | 3 | < 0.1%
1/1 Clarendon St | 3 | < 0.1%
14 James St | 3 | < 0.1%
Other values (33999) | 34823 | 99.9%

Length

Max length: 27
Mean length: 13.55136701
Min length: 8

Value | Count | Frequency (%)
Lowercase_Letter | 26 | 40.6%
Uppercase_Letter | 26 | 40.6%
Decimal_Number | 10 | 15.6%
Other_Punctuation | 1 | 1.6%
Space_Separator | 1 | 1.6%

Value | Count | Frequency (%)
Latin | 52 | 81.2%
Common | 12 | 18.8%

Value | Count | Frequency (%)
ASCII | 64 | 100.0%
 

Rooms
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 12
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.031012422
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
median: 3
Q3: 4
95-th percentile: 5
Maximum: 16
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9699329349
Coefficient of variation (CV): 0.320002956
Kurtosis: 2.511708654
Mean: 3.031012422
Median Absolute Deviation (MAD): 0.6920879164
Skewness: 0.4990968808
Sum: 105652
Variance: 0.9407698982

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 3.5 4.5 5.5 6.5 8.5 11. 16. ], "bayesian blocks" binning strategy used)

Value | Count | Frequency (%)
3 | 15084 | 43.3%
2 | 8332 | 23.9%
4 | 7956 | 22.8%
5 | 1737 | 5.0%
1 | 1479 | 4.2%
6 | 204 | 0.6%
7 | 32 | 0.1%
8 | 19 | 0.1%
10 | 6 | < 0.1%
9 | 4 | < 0.1%
Other values (2) | 4 | < 0.1%

Value | Count | Frequency (%)
1 | 1479 | 4.2%
2 | 8332 | 23.9%
3 | 15084 | 43.3%
4 | 7956 | 22.8%
5 | 1737 | 5.0%

Value | Count | Frequency (%)
16 | 1 | < 0.1%
12 | 3 | < 0.1%
10 | 6 | < 0.1%
9 | 4 | < 0.1%
8 | 19 | 0.1%
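
Because Rooms takes only 12 distinct values, its descriptive statistics can be recomputed from the frequency tables alone. The sketch below is an illustration rather than the report's own code: it rebuilds the value counts (the two "other values" are 12 and 16, read off the table of largest values) and derives the mean, sample variance, CV and mean absolute deviation. All names are illustrative.

```python
import math

# Value -> count for Rooms, copied from the frequency tables above.
counts = {1: 1479, 2: 8332, 3: 15084, 4: 7956, 5: 1737, 6: 204,
          7: 32, 8: 19, 9: 4, 10: 6, 12: 3, 16: 1}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # "Sum"
mean = total / n
# Sample variance (divides by n - 1); this is the convention that
# reproduces the reported "Variance".
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                   # coefficient of variation
# Mean absolute deviation around the mean.
mean_abs_dev = sum(c * abs(v - mean) for v, c in counts.items()) / n
value_range = max(counts) - min(counts)

print(n, total, round(mean, 9), round(variance, 9))
print(round(std, 9), round(cv, 9), round(mean_abs_dev, 9), value_range)
```

Note that the figure labelled "Median Absolute Deviation (MAD)" is reproduced by the mean absolute deviation around the mean. This suggests that the MAD line in this report is a mean absolute deviation rather than a median absolute deviation, so it should be read with that in mind.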
 

Type
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: h (23980), u (7297), t (3580)

Value | Count | Frequency (%)
h | 23980 | 68.8%
u | 7297 | 20.9%
t | 3580 | 10.3%

Length

Max length: 1
Mean length: 1
Min length: 1

Value | Count | Frequency (%)
Lowercase_Letter | 3 | 100.0%

Value | Count | Frequency (%)
Latin | 3 | 100.0%

Value | Count | Frequency (%)
ASCII | 3 | 100.0%
 

Price
Real number (ℝ≥0)

MISSING
Distinct count: 2871
Unique (%): 10.5%
Missing: 7610
Missing (%): 21.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1050173.345
Minimum: 85000
Maximum: 11200000
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 85000
5-th percentile: 415000
Q1: 635000
median: 870000
Q3: 1295000
95-th percentile: 2250000
Maximum: 11200000
Range: 11115000
Interquartile range (IQR): 660000

Descriptive statistics

Standard deviation: 641467.1301
Coefficient of variation (CV): 0.6108202357
Kurtosis: 13.09720052
Mean: 1050173.345
Median Absolute Deviation (MAD): 452081.7604
Skewness: 2.588969341
Sum: 2.861407313e+10
Variance: 4.11480079e+11

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
600000 | 235 | 0.7%
1100000 | 235 | 0.7%
650000 | 219 | 0.6%
800000 | 217 | 0.6%
1300000 | 210 | 0.6%
1000000 | 205 | 0.6%
1200000 | 204 | 0.6%
700000 | 197 | 0.6%
750000 | 194 | 0.6%
900000 | 191 | 0.5%
Other values (2861) | 25140 | 72.1%
(Missing) | 7610 | 21.8%

Value | Count | Frequency (%)
85000 | 1 | < 0.1%
112000 | 1 | < 0.1%
121000 | 1 | < 0.1%
131000 | 1 | < 0.1%
145000 | 2 | < 0.1%

Value | Count | Frequency (%)
11200000 | 1 | < 0.1%
9000000 | 1 | < 0.1%
8000000 | 1 | < 0.1%
7650000 | 1 | < 0.1%
7000000 | 1 | < 0.1%
 

Method
Categorical

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: S (19744), SP (5095), PI (4850), VB (3108), SN (1317); Other values (4): 743

Value | Count | Frequency (%)
S | 19744 | 56.6%
SP | 5095 | 14.6%
PI | 4850 | 13.9%
VB | 3108 | 8.9%
SN | 1317 | 3.8%
PN | 308 | 0.9%
SA | 226 | 0.6%
W | 173 | 0.5%
SS | 36 | 0.1%

Length

Max length: 2
Mean length: 1.428608314
Min length: 1

Value | Count | Frequency (%)
Uppercase_Letter | 8 | 100.0%

Value | Count | Frequency (%)
Latin | 8 | 100.0%

Value | Count | Frequency (%)
ASCII | 8 | 100.0%
 

SellerG
Categorical

HIGH CARDINALITY
Distinct count: 388
Unique (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: Jellis (3359), Nelson (3236), Barry (3235), hockingstuart (2623), Marshall (2027); Other values (383): 20377

Value | Count | Frequency (%)
Jellis | 3359 | 9.6%
Nelson | 3236 | 9.3%
Barry | 3235 | 9.3%
hockingstuart | 2623 | 7.5%
Marshall | 2027 | 5.8%
Ray | 1950 | 5.6%
Buxton | 1868 | 5.4%
Biggin | 897 | 2.6%
Fletchers | 861 | 2.5%
Woodards | 714 | 2.0%
Other values (378) | 14087 | 40.4%

Length

Max length: 27
Mean length: 6.291533982
Min length: 1

Value | Count | Frequency (%)
Uppercase_Letter | 26 | 44.8%
Lowercase_Letter | 25 | 43.1%
Other_Punctuation | 5 | 8.6%
Decimal_Number | 2 | 3.4%

Value | Count | Frequency (%)
Latin | 51 | 87.9%
Common | 7 | 12.1%

Value | Count | Frequency (%)
ASCII | 58 | 100.0%
 

Date
Categorical

HIGH CARDINALITY
Distinct count: 78
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB
Top values: 28/10/2017 (1119), 17/03/2018 (970), 24/02/2018 (941), 9/12/2017 (927), 25/11/2017 (902); Other values (73): 29998

Value | Count | Frequency (%)
28/10/2017 | 1119 | 3.2%
17/03/2018 | 970 | 2.8%
24/02/2018 | 941 | 2.7%
9/12/2017 | 927 | 2.7%
25/11/2017 | 902 | 2.6%
18/11/2017 | 866 | 2.5%
3/03/2018 | 846 | 2.4%
6/01/2018 | 787 | 2.3%
27/05/2017 | 770 | 2.2%
23/09/2017 | 742 | 2.1%
Other values (68) | 25987 | 74.6%

Length

Max length: 10
Mean length: 9.714748831
Min length: 9

Value | Count | Frequency (%)
Decimal_Number | 10 | 90.9%
Other_Punctuation | 1 | 9.1%

Value | Count | Frequency (%)
Common | 11 | 100.0%

Value | Count | Frequency (%)
ASCII | 11 | 100.0%
 

Distance
Real number (ℝ≥0)

Distinct count: 215
Unique (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.18492942
Minimum: 0
Maximum: 48.1
Zeros: 77
Zeros (%): 0.2%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.7
Q1: 6.4
median: 10.3
Q3: 14
95-th percentile: 24.7
Maximum: 48.1
Range: 48.1
Interquartile range (IQR): 7.6

Descriptive statistics

Standard deviation: 6.788892456
Coefficient of variation (CV): 0.6069678403
Kurtosis: 3.585924276
Mean: 11.18492942
Median Absolute Deviation (MAD): 4.981600057
Skewness: 1.503585816
Sum: 389861.9
Variance: 46.08906078

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
11.2 | 1420 | 4.1%
13.8 | 681 | 2.0%
9.2 | 665 | 1.9%
7.8 | 662 | 1.9%
10.5 | 660 | 1.9%
8.4 | 604 | 1.7%
4.6 | 585 | 1.7%
14.7 | 566 | 1.6%
5.2 | 565 | 1.6%
11.4 | 521 | 1.5%
Other values (205) | 27927 | 80.1%

Value | Count | Frequency (%)
0 | 77 | 0.2%
0.7 | 29 | 0.1%
1.2 | 47 | 0.1%
1.3 | 30 | 0.1%
1.4 | 6 | < 0.1%

Value | Count | Frequency (%)
48.1 | 6 | < 0.1%
47.4 | 7 | < 0.1%
47.3 | 20 | 0.1%
45.9 | 33 | 0.1%
45.2 | 2 | < 0.1%
 

Postcode
Real number (ℝ≥0)

Distinct count: 211
Unique (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3116.062859
Minimum: 3000
Maximum: 3978
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 3000
5-th percentile: 3015
Q1: 3051
median: 3103
Q3: 3156
95-th percentile: 3204
Maximum: 3978
Range: 978
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 109.0239027
Coefficient of variation (CV): 0.03498770971
Kurtosis: 22.78373808
Mean: 3116.062859
Median Absolute Deviation (MAD): 66.16036392
Skewness: 4.018785705
Sum: 108613487
Variance: 11886.21137

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3073 | 844 | 2.4%
3046 | 638 | 1.8%
3020 | 617 | 1.8%
3121 | 612 | 1.8%
3165 | 583 | 1.7%
3058 | 556 | 1.6%
3040 | 535 | 1.5%
3204 | 518 | 1.5%
3163 | 508 | 1.5%
3012 | 497 | 1.4%
Other values (201) | 28948 | 83.0%

Value | Count | Frequency (%)
3000 | 204 | 0.6%
3002 | 59 | 0.2%
3003 | 66 | 0.2%
3006 | 76 | 0.2%
3008 | 16 | < 0.1%

Value | Count | Frequency (%)
3978 | 5 | < 0.1%
3977 | 33 | 0.1%
3976 | 7 | < 0.1%
3975 | 2 | < 0.1%
3910 | 18 | 0.1%
 

Bedroom2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
Distinct count: 15
Unique (%): 0.1%
Missing: 8217
Missing (%): 23.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.084647147
Minimum: 0
Maximum: 30
Zeros: 17
Zeros (%): < 0.1%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 2
median: 3
Q3: 4
95-th percentile: 5
Maximum: 30
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9806897285
Coefficient of variation (CV): 0.3179260647
Kurtosis: 26.80745531
Mean: 3.084647147
Median Absolute Deviation (MAD): 0.7010441044
Skewness: 1.406365679
Sum: 82175
Variance: 0.9617523437

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3 | 11881 | 34.1%
4 | 6348 | 18.2%
2 | 5777 | 16.6%
5 | 1427 | 4.1%
1 | 966 | 2.8%
6 | 168 | 0.5%
7 | 30 | 0.1%
0 | 17 | < 0.1%
8 | 13 | < 0.1%
9 | 5 | < 0.1%
Other values (5) | 8 | < 0.1%
(Missing) | 8217 | 23.6%

Value | Count | Frequency (%)
0 | 17 | < 0.1%
1 | 966 | 2.8%
2 | 5777 | 16.6%
3 | 11881 | 34.1%
4 | 6348 | 18.2%

Value | Count | Frequency (%)
30 | 1 | < 0.1%
20 | 1 | < 0.1%
16 | 1 | < 0.1%
12 | 1 | < 0.1%
10 | 4 | < 0.1%
 

Bathroom
Real number (ℝ≥0)

MISSING
Distinct count: 11
Unique (%): < 0.1%
Missing: 8226
Missing (%): 23.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.624798168
Minimum: 0
Maximum: 12
Zeros: 46
Zeros (%): 0.1%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 12
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7242120115
Coefficient of variation (CV): 0.4457242911
Kurtosis: 4.861008943
Mean: 1.624798168
Median Absolute Deviation (MAD): 0.6141525403
Skewness: 1.356293032
Sum: 43270
Variance: 0.5244830376

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1 | 12969 | 37.2%
2 | 11064 | 31.7%
3 | 2181 | 6.3%
4 | 269 | 0.8%
5 | 77 | 0.2%
0 | 46 | 0.1%
6 | 16 | < 0.1%
7 | 4 | < 0.1%
8 | 3 | < 0.1%
12 | 1 | < 0.1%
(Missing) | 8226 | 23.6%

Value | Count | Frequency (%)
0 | 46 | 0.1%
1 | 12969 | 37.2%
2 | 11064 | 31.7%
3 | 2181 | 6.3%
4 | 269 | 0.8%

Value | Count | Frequency (%)
12 | 1 | < 0.1%
9 | 1 | < 0.1%
8 | 3 | < 0.1%
7 | 4 | < 0.1%
6 | 16 | < 0.1%
 

Car
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 15
Unique (%): 0.1%
Missing: 8728
Missing (%): 25.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.728845344
Minimum: 0
Maximum: 26
Zeros: 1631
Zeros (%): 4.7%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 2
95-th percentile: 4
Maximum: 26
Range: 26
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.010770785
Coefficient of variation (CV): 0.5846507837
Kurtosis: 20.85932625
Mean: 1.728845344
Median Absolute Deviation (MAD): 0.7270760834
Skewness: 2.09517618
Sum: 45173
Variance: 1.021657581

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2 | 12214 | 35.0%
1 | 9164 | 26.3%
0 | 1631 | 4.7%
3 | 1606 | 4.6%
4 | 1161 | 3.3%
5 | 151 | 0.4%
6 | 140 | 0.4%
7 | 25 | 0.1%
8 | 23 | 0.1%
10 | 6 | < 0.1%
Other values (5) | 8 | < 0.1%
(Missing) | 8728 | 25.0%

Value | Count | Frequency (%)
0 | 1631 | 4.7%
1 | 9164 | 26.3%
2 | 12214 | 35.0%
3 | 1606 | 4.6%
4 | 1161 | 3.3%

Value | Count | Frequency (%)
26 | 1 | < 0.1%
18 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 2 | < 0.1%
10 | 6 | < 0.1%
 

Landsize
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS
Distinct count: 1684
Unique (%): 7.3%
Missing: 11810
Missing (%): 33.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 593.5989934
Minimum: 0
Maximum: 433014
Zeros: 2437
Zeros (%): 7.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 224
median: 521
Q3: 670
95-th percentile: 1001
Maximum: 433014
Range: 433014
Interquartile range (IQR): 446

Descriptive statistics

Standard deviation: 3398.841946
Coefficient of variation (CV): 5.725821614
Kurtosis: 11580.16251
Mean: 593.5989934
Median Absolute Deviation (MAD): 375.5721235
Skewness: 96.02231136
Sum: 13680676
Variance: 11552126.58

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 2437 | 7.0%
650 | 204 | 0.6%
697 | 123 | 0.4%
585 | 97 | 0.3%
700 | 86 | 0.2%
604 | 84 | 0.2%
534 | 81 | 0.2%
696 | 80 | 0.2%
652 | 68 | 0.2%
600 | 68 | 0.2%
Other values (1674) | 19719 | 56.6%
(Missing) | 11810 | 33.9%

Value | Count | Frequency (%)
0 | 2437 | 7.0%
1 | 3 | < 0.1%
2 | 1 | < 0.1%
3 | 2 | < 0.1%
5 | 1 | < 0.1%

Value | Count | Frequency (%)
433014 | 1 | < 0.1%
146699 | 1 | < 0.1%
89030 | 1 | < 0.1%
80000 | 1 | < 0.1%
76000 | 1 | < 0.1%
 

BuildingArea
Real number (ℝ≥0)

MISSING
SKEWED
Distinct count: 740
Unique (%): 5.4%
Missing: 21115
Missing (%): 60.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.2564004
Minimum: 0
Maximum: 44515
Zeros: 76
Zeros (%): 0.2%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 56
Q1: 102
median: 136
Q3: 188
95-th percentile: 310
Maximum: 44515
Range: 44515
Interquartile range (IQR): 86

Descriptive statistics

Standard deviation: 401.2670601
Coefficient of variation (CV): 2.50390661
Kurtosis: 10877.52575
Mean: 160.2564004
Median Absolute Deviation (MAD): 68.00017941
Skewness: 99.13257937
Sum: 2202243.454
Variance: 161015.2535

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
120 | 185 | 0.5%
100 | 161 | 0.5%
110 | 159 | 0.5%
130 | 153 | 0.4%
115 | 149 | 0.4%
140 | 142 | 0.4%
150 | 136 | 0.4%
160 | 123 | 0.4%
112 | 123 | 0.4%
125 | 119 | 0.3%
Other values (730) | 12292 | 35.3%
(Missing) | 21115 | 60.6%

Value | Count | Frequency (%)
0 | 76 | 0.2%
0.01 | 1 | < 0.1%
1 | 15 | < 0.1%
2 | 20 | 0.1%
3 | 25 | 0.1%

Value | Count | Frequency (%)
44515 | 1 | < 0.1%
6791 | 1 | < 0.1%
6178 | 1 | < 0.1%
4645 | 1 | < 0.1%
3647 | 1 | < 0.1%
 

YearBuilt
Real number (ℝ≥0)

MISSING
Distinct count: 160
Unique (%): 1.0%
Missing: 19306
Missing (%): 55.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1965.289885
Minimum: 1196
Maximum: 2106
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 1196
5-th percentile: 1900
Q1: 1940
median: 1970
Q3: 2000
95-th percentile: 2013
Maximum: 2106
Range: 910
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 37.32817802
Coefficient of variation (CV): 0.01899372622
Kurtosis: 10.89861685
Mean: 1965.289885
Median Absolute Deviation (MAD): 30.22706858
Skewness: -1.080913147
Sum: 30562223
Variance: 1393.392875

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1970 | 1490 | 4.3%
1960 | 1260 | 3.6%
1950 | 1089 | 3.1%
1980 | 726 | 2.1%
1900 | 606 | 1.7%
2000 | 571 | 1.6%
1920 | 545 | 1.6%
1930 | 531 | 1.5%
1910 | 460 | 1.3%
1890 | 444 | 1.3%
Other values (150) | 7829 | 22.5%
(Missing) | 19306 | 55.4%

Value | Count | Frequency (%)
1196 | 1 | < 0.1%
1800 | 1 | < 0.1%
1820 | 1 | < 0.1%
1830 | 1 | < 0.1%
1850 | 4 | < 0.1%

Value | Count | Frequency (%)
2106 | 1 | < 0.1%
2019 | 1 | < 0.1%
2018 | 4 | < 0.1%
2017 | 82 | 0.2%
2016 | 130 | 0.4%
 

CouncilArea
Categorical

Distinct count: 33
Unique (%): 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 272.4 KiB
Top values: Boroondara City Council (3675), Darebin City Council (2851), Moreland City Council (2122), Glen Eira City Council (2006), Melbourne City Council (1952); Other values (28): 22248

Value | Count | Frequency (%)
Boroondara City Council | 3675 | 10.5%
Darebin City Council | 2851 | 8.2%
Moreland City Council | 2122 | 6.1%
Glen Eira City Council | 2006 | 5.8%
Melbourne City Council | 1952 | 5.6%
Banyule City Council | 1861 | 5.3%
Moonee Valley City Council | 1791 | 5.1%
Bayside City Council | 1764 | 5.1%
Brimbank City Council | 1593 | 4.6%
Monash City Council | 1466 | 4.2%
Other values (23) | 13773 | 39.5%

Length

Max length: 30
Mean length: 21.73256448
Min length: 3

Value | Count | Frequency (%)
Lowercase_Letter | 20 | 54.1%
Uppercase_Letter | 16 | 43.2%
Space_Separator | 1 | 2.7%

Value | Count | Frequency (%)
Latin | 36 | 97.3%
Common | 1 | 2.7%

Value | Count | Frequency (%)
ASCII | 37 | 100.0%
 

Lattitude
Real number (ℝ)

MISSING
Distinct count: 13402
Unique (%): 49.9%
Missing: 7976
Missing (%): 22.9%
Infinite: 0
Infinite (%): 0.0%
Mean: -37.8106343
Minimum: -38.19043
Maximum: -37.3902
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: -38.19043
5-th percentile: -37.9485
Q1: -37.86295
median: -37.8076
Q3: -37.7541
95-th percentile: -37.67519
Maximum: -37.3902
Range: 0.80023
Interquartile range (IQR): 0.10885

Descriptive statistics

Standard deviation: 0.09027890451
Coefficient of variation (CV): -0.002387659086
Kurtosis: 1.544527049
Mean: -37.8106343
Median Absolute Deviation (MAD): 0.06903697867
Skewness: -0.2576614223
Sum: -1016387.66
Variance: 0.008150280599

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
-37.8361 | 25 | 0.1%
-37.8424 | 22 | 0.1%
-37.8198 | 20 | 0.1%
-37.7956 | 20 | 0.1%
-37.8414 | 18 | 0.1%
-37.7969 | 18 | 0.1%
-37.8536 | 17 | < 0.1%
-37.7941 | 17 | < 0.1%
-37.7634 | 17 | < 0.1%
-37.8127 | 16 | < 0.1%
Other values (13392) | 26691 | 76.6%
(Missing) | 7976 | 22.9%

Value | Count | Frequency (%)
-38.19043 | 1 | < 0.1%
-38.1856 | 1 | < 0.1%
-38.18463 | 1 | < 0.1%
-38.18418 | 1 | < 0.1%
-38.18415 | 1 | < 0.1%

Value | Count | Frequency (%)
-37.3902 | 1 | < 0.1%
-37.3951 | 1 | < 0.1%
-37.3978 | 1 | < 0.1%
-37.39946 | 1 | < 0.1%
-37.40349 | 1 | < 0.1%
 

Longtitude
Real number (ℝ≥0)

MISSING
Distinct count: 14524
Unique (%): 54.0%
Missing: 7976
Missing (%): 22.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 145.0018511
Minimum: 144.42379
Maximum: 145.52635
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 144.42379
5-th percentile: 144.80008
Q1: 144.9335
median: 145.0078
Q3: 145.0719
95-th percentile: 145.1877
Maximum: 145.52635
Range: 1.10256
Interquartile range (IQR): 0.1384

Descriptive statistics

Standard deviation: 0.1201687692
Coefficient of variation (CV): 0.0008287395521
Kurtosis: 1.545947474
Mean: 145.0018511
Median Absolute Deviation (MAD): 0.08891049626
Skewness: -0.3948800169
Sum: 3897794.76
Variance: 0.01444053308

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
144.9966 | 21 | 0.1%
144.991 | 17 | < 0.1%
144.985 | 17 | < 0.1%
145.0104 | 17 | < 0.1%
144.9679 | 16 | < 0.1%
144.9911 | 16 | < 0.1%
145.0001 | 16 | < 0.1%
145.0243 | 16 | < 0.1%
144.997 | 15 | < 0.1%
144.9999 | 15 | < 0.1%
Other values (14514) | 26715 | 76.6%
(Missing) | 7976 | 22.9%

Value | Count | Frequency (%)
144.42379 | 1 | < 0.1%
144.43162 | 1 | < 0.1%
144.43181 | 1 | < 0.1%
144.4394 | 1 | < 0.1%
144.44051 | 1 | < 0.1%

Value | Count | Frequency (%)
145.52635 | 1 | < 0.1%
145.5237 | 1 | < 0.1%
145.51137 | 1 | < 0.1%
145.48985 | 1 | < 0.1%
145.48273 | 1 | < 0.1%
 

Regionname
Categorical

Distinct count: 8
Unique (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 272.4 KiB
Top values: Southern Metropolitan (11836), Northern Metropolitan (9557), Western Metropolitan (6799), Eastern Metropolitan (4377), South-Eastern Metropolitan (1739); Other values (3): 546

Value | Count | Frequency (%)
Southern Metropolitan | 11836 | 34.0%
Northern Metropolitan | 9557 | 27.4%
Western Metropolitan | 6799 | 19.5%
Eastern Metropolitan | 4377 | 12.6%
South-Eastern Metropolitan | 1739 | 5.0%
Eastern Victoria | 228 | 0.7%
Northern Victoria | 203 | 0.6%
Western Victoria | 115 | 0.3%
(Missing) | 3 | < 0.1%

Length

Max length: 26
Mean length: 20.85477809
Min length: 3

Value | Count | Frequency (%)
Lowercase_Letter | 13 | 61.9%
Uppercase_Letter | 6 | 28.6%
Space_Separator | 1 | 4.8%
Dash_Punctuation | 1 | 4.8%

Value | Count | Frequency (%)
Latin | 19 | 90.5%
Common | 2 | 9.5%

Value | Count | Frequency (%)
ASCII | 21 | 100.0%
 

Propertycount
Real number (ℝ≥0)

Distinct count: 342
Unique (%): 1.0%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7572.888306
Minimum: 83
Maximum: 21650
Zeros: 0
Zeros (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 83
5-th percentile: 2185
Q1: 4385
median: 6763
Q3: 10412
95-th percentile: 15510
Maximum: 21650
Range: 21567
Interquartile range (IQR): 6027

Descriptive statistics

Standard deviation: 4428.090313
Coefficient of variation (CV): 0.5847293839
Kurtosis: 0.8906876388
Mean: 7572.888306
Median Absolute Deviation (MAD): 3495.245885
Skewness: 0.9921002749
Sum: 263945449
Variance: 19607983.82

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
21650 | 844 | 2.4%
8870 | 722 | 2.1%
10969 | 583 | 1.7%
14949 | 552 | 1.6%
10412 | 491 | 1.4%
14577 | 485 | 1.4%
10331 | 467 | 1.3%
10579 | 456 | 1.3%
11918 | 444 | 1.3%
14887 | 435 | 1.2%
Other values (332) | 29375 | 84.3%

Value | Count | Frequency (%)
83 | 1 | < 0.1%
121 | 1 | < 0.1%
129 | 1 | < 0.1%
242 | 1 | < 0.1%
249 | 5 | < 0.1%

Value | Count | Frequency (%)
21650 | 844 | 2.4%
17496 | 204 | 0.6%
17384 | 20 | 0.1%
17093 | 47 | 0.1%
17055 | 123 | 0.4%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
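
As an illustration of this definition, here is a minimal sketch in plain Python. It is not the code the report uses; the function name and the toy data are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# Toy data (made up for illustration, not taken from the dataset).
x = [1, 2, 3, 4, 5]
y = [1, 2, 2, 4, 5]

r = pearson_r(x, y)
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], y)
# An exactly linear relationship gives r = 1 regardless of its slope.
r_perfect = pearson_r(x, [2 * v + 1 for v in x])

print(round(r, 4), round(r_rescaled, 4), round(r_perfect, 4))
```

The same divisor (n - 1) is used for the covariance and both standard deviations, so it cancels and the choice between sample and population formulas does not change r.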

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
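
A minimal sketch of this rank-then-correlate procedure, again in plain Python and not taken from the report's implementation. Giving tied values the average of their ranks is an assumption made here (it is a common convention); all names and data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r; the common divisor of covariance and variances cancels."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(round(rho, 4), round(r, 4))
```

On this toy data ρ reaches 1 because the ordering of the two variables is identical, while r stays below 1 because the relationship is not a straight line.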

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
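
The pair-counting definition can be sketched directly. This is an illustrative implementation of the formula as stated (often called tau-a), not the report's code; libraries such as SciPy default to the tau-b variant, which additionally corrects for ties and coincides with this formula when there are none. Names and data are made up.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)

# Toy data with one swapped pair: 5 concordant and 1 discordant out of 6 pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau(x, y)
print(round(tau, 4))
```

Counting every pair is quadratic in the number of observations, which is acceptable for a small illustration but slow for a dataset of this size.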

Missing values

Sample

First rows

Index | Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 3/09/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 126.0 | NaN | NaN | Yarra City Council | -37.8014 | 144.9958 | Northern Metropolitan | 4019.0
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 3/12/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | NaN | NaN | Yarra City Council | -37.7996 | 144.9984 | Northern Metropolitan | 4019.0
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 4/02/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | 79.0 | 1900.0 | Yarra City Council | -37.8079 | 144.9934 | Northern Metropolitan | 4019.0
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 4/02/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 0.0 | NaN | NaN | Yarra City Council | -37.8114 | 145.0116 | Northern Metropolitan | 4019.0
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 4/03/2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | 150.0 | 1900.0 | Yarra City Council | -37.8093 | 144.9944 | Northern Metropolitan | 4019.0
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 4/03/2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | NaN | NaN | Yarra City Council | -37.7969 | 144.9969 | Northern Metropolitan | 4019.0
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 4/06/2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | 142.0 | 2014.0 | Yarra City Council | -37.8072 | 144.9941 | Northern Metropolitan | 4019.0
7 | Abbotsford | 16 Maugie St | 4 | h | NaN | SN | Nelson | 6/08/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 2.0 | 400.0 | 220.0 | 2006.0 | Yarra City Council | -37.7965 | 144.9965 | Northern Metropolitan | 4019.0
8 | Abbotsford | 53 Turner St | 2 | h | NaN | S | Biggin | 6/08/2016 | 2.5 | 3067.0 | 4.0 | 1.0 | 2.0 | 201.0 | NaN | 1900.0 | Yarra City Council | -37.7995 | 144.9974 | Northern Metropolitan | 4019.0
9 | Abbotsford | 99 Turner St | 2 | h | NaN | S | Collins | 6/08/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 202.0 | NaN | 1900.0 | Yarra City Council | -37.7996 | 144.9989 | Northern Metropolitan | 4019.0

Last rows

Index | Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount
34847 | Wollert | 27 Birchmore Rd | 3 | h | 500000.0 | PI | Ray | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 383.0 | 118.0 | 2016.0 | Whittlesea City Council | -37.61940 | 145.03951 | Northern Metropolitan | 2940.0
34848 | Wollert | 16 Gunther Wy | 4 | h | 621000.0 | S | hockingstuart | 24/02/2018 | 25.5 | 3750.0 | 4.0 | 2.0 | 2.0 | 375.0 | NaN | NaN | Whittlesea City Council | -37.61331 | 145.03412 | Northern Metropolitan | 2940.0
34849 | Wollert | 35 Kingscote Wy | 3 | h | 570000.0 | SP | RW | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 404.0 | 158.0 | 2012.0 | Whittlesea City Council | -37.61031 | 145.03393 | Northern Metropolitan | 2940.0
34850 | Wollert | 15 Rockgarden Wy | 3 | h | NaN | SP | LJ | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 268.0 | 135.0 | 2016.0 | Whittlesea City Council | -37.61094 | 145.04281 | Northern Metropolitan | 2940.0
34851 | Yarraville | 78 Bayview Rd | 3 | h | 1101000.0 | S | Jas | 24/02/2018 | 6.3 | 3013.0 | 3.0 | 1.0 | NaN | 288.0 | NaN | NaN | Maribyrnong City Council | -37.81095 | 144.88516 | Western Metropolitan | 6543.0
34852 | Yarraville | 13 Burns St | 4 | h | 1480000.0 | PI | Jas | 24/02/2018 | 6.3 | 3013.0 | 4.0 | 1.0 | 3.0 | 593.0 | NaN | NaN | Maribyrnong City Council | -37.81053 | 144.88467 | Western Metropolitan | 6543.0
34853 | Yarraville | 29A Murray St | 2 | h | 888000.0 | SP | Sweeney | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 2.0 | 1.0 | 98.0 | 104.0 | 2018.0 | Maribyrnong City Council | -37.81551 | 144.88826 | Western Metropolitan | 6543.0
34854 | Yarraville | 147A Severn St | 2 | t | 705000.0 | S | Jas | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 1.0 | 2.0 | 220.0 | 120.0 | 2000.0 | Maribyrnong City Council | -37.82286 | 144.87856 | Western Metropolitan | 6543.0
34855 | Yarraville | 12/37 Stephen St | 3 | h | 1140000.0 | SP | hockingstuart | 24/02/2018 | 6.3 | 3013.0 | NaN | NaN | NaN | NaN | NaN | NaN | Maribyrnong City Council | NaN | NaN | Western Metropolitan | 6543.0
34856 | Yarraville | 3 Tarrengower St | 2 | h | 1020000.0 | PI | RW | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 1.0 | 0.0 | 250.0 | 103.0 | 1930.0 | Maribyrnong City Council | -37.81810 | 144.89351 | Western Metropolitan | 6543.0